{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "aXeleRate_pascal20_detector.ipynb",
      "provenance": [],
      "private_outputs": true,
      "collapsed_sections": [],
      "mount_file_id": "1_yhmzOZKns_-h0GwyPu9YAT3K0WQ1PG8",
      "authorship_tag": "ABX9TyPliRCW4lPH/Pzj72qrwBJY",
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    },
    "accelerator": "GPU"
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/AIWintermuteAI/aXeleRate/blob/master/resources/aXeleRate_pascal20_detector.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "hS9yMrWe02WQ",
        "colab_type": "text"
      },
      "source": [
        "## PASCAL-VOC Detection model Training and Inference\n",
        "\n",
        "In this notebook we will use axelerate, Keras-based framework for AI on the edge, to quickly setup model training and then after training session is completed convert it to .tflite and .kmodel formats.\n",
        "\n",
        "First, let's take care of some administrative details. \n",
        "\n",
        "1) Before we do anything, make sure you have choosen GPU as Runtime type (in Runtime - > Change Runtime type).\n",
        "\n",
        "2) We need to mount Google Drive for saving our model checkpoints and final converted model(s). Press on Mount Google Drive button in Files tab on your left. \n",
        "\n",
        "In the next cell we clone axelerate Github repository and import it. \n",
        "\n",
        "**It is possible to use pip install or python setup.py install, but in that case you will need to restart the enironment.** Since I'm trying to make the process as streamlined as possibile I'm using sys.path.append for import."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "y07yAbYbjV2s",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "%tensorflow_version 1.x\n",
        "#we need imgaug 0.4 for image augmentations to work properly, see https://stackoverflow.com/questions/62580797/in-colab-doing-image-data-augmentation-with-imgaug-is-not-working-as-intended\n",
        "!pip uninstall -y imgaug && pip uninstall -y albumentations && pip install imgaug==0.4\n",
        "!git clone https://github.com/AIWintermuteAI/aXeleRate.git\n",
        "import sys\n",
        "sys.path.append('/content/aXeleRate')\n",
        "from axelerate import setup_training,setup_inference"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5TBRMPZ83dRL",
        "colab_type": "text"
      },
      "source": [
        "At this step you typically need to get the dataset. You can use !wget command to download it from somewhere on the Internet or !cp to copy from My Drive as in this example\n",
        "```\n",
        "!cp -r /content/drive/'My Drive'/pascal_20_segmentation.zip .\n",
        "!unzip --qq pascal_20_segmentation.zip\n",
        "```\n",
        "For this notebook we will use PASCAL-VOC 2012 object detection dataset, which you can download here:\n",
        "\n",
        "http://host.robots.ox.ac.uk:8080/pascal/VOC/voc2012/index.html#devkit\n",
        "\n",
        "I split the dataset into training and validation using a simple Python script. Since most of the models trained with aXeleRate are to be run on embedded devices and thus have memory and latency constraints, the validation images are easier than most of the images in training set. The validation images include one(or many) instance of a particular class, no mixed classes in one image.\n",
        "\n",
        "Let's visualize our detection model test dataset. We use img_num=10 to show only first 10 images. Feel free to change the number to None to see all 100 images.\n"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "_tpsgkGj7d79",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "%matplotlib inline\n",
        "!gdown https://drive.google.com/uc?id=1xgk7svdjBiEyzyUVoZrCz4PP6dSjVL8S  #pascal-voc dataset\n",
        "!gdown https://drive.google.com/uc?id=1-ccBXBEUhyzG2_jopf6d13hTH9X7qpNl  #pre-trained model\n",
        "!unzip --qq pascal_20_detection.zip\n",
        "\n",
        "from axelerate.networks.common_utils.augment import visualize_detection_dataset\n",
        "\n",
        "visualize_detection_dataset(img_folder='pascal_20_detection/imgs_validation', ann_folder='pascal_20_detection/anns_validation', num_imgs=10, img_size=224, jitter=True)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "S1oqdtbr7VLB",
        "colab_type": "text"
      },
      "source": [
        "Next step is defining a config dictionary. Most lines are self-explanatory.\n",
        "\n",
        "Type is model frontend - Classifier, Detector or Segnet\n",
        "\n",
        "Architecture is model backend (feature extractor) \n",
        "\n",
        "- Full Yolo\n",
        "- Tiny Yolo\n",
        "- MobileNet1_0\n",
        "- MobileNet7_5 \n",
        "- MobileNet5_0 \n",
        "- MobileNet2_5 \n",
        "- SqueezeNet\n",
        "- NASNetMobile\n",
        "- DenseNet121\n",
        "- ResNet50\n",
        "\n",
        "For more information on anchors, please read here\n",
        "https://github.com/pjreddie/darknet/issues/568\n",
        "\n",
        "Labels are labels present in your dataset.\n",
        "IMPORTANT: Please, list all the labels present in the dataset.\n",
        "\n",
        "object_scale determines how much to penalize wrong prediction of confidence of object predictors\n",
        "\n",
        "no_object_scale determines how much to penalize wrong prediction of confidence of non-object predictors\n",
        "\n",
        "coord_scale determines how much to penalize wrong position and size predictions (x, y, w, h)\n",
        "\n",
        "class_scale determines how much to penalize wrong class prediction\n",
        "\n",
        "For converter type you can choose the following:\n",
        "\n",
        "'k210', 'tflite_fullint', 'tflite_dynamic', 'edgetpu', 'openvino', 'onnx'\n",
        "\n",
        "**Since it is an example notebook, we will use pretrained weights and set all layers of the model to be \"frozen\"(non-trainable).** "
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Jw4q6_MsegD2",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "config = {\n",
        "        \"model\":{\n",
        "            \"type\":                 \"Detector\",\n",
        "            \"architecture\":         \"MobileNet1_0\",\n",
        "            \"input_size\":           224,\n",
        "            \"anchors\":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],\n",
        "            \"labels\":               [\"person\", \"bird\", \"cat\", \"cow\", \"dog\", \"horse\", \"sheep\", \"aeroplane\", \"bicycle\", \"boat\", \"bus\", \"car\", \"motorbike\", \"train\",\"bottle\", \"chair\", \"diningtable\", \"pottedplant\", \"sofa\", \"tvmonitor\"],\n",
        "            \"coord_scale\" : \t\t1.0,\n",
        "            \"class_scale\" : \t\t1.0,\n",
        "            \"object_scale\" : \t\t5.0,\n",
        "            \"no_object_scale\" : \t1.0\n",
        "        },\n",
        "        \"weights\" : {\n",
        "            \"full\":   \t\t\t\t\"/content/2020-04-12_17-09-43.h5\",\n",
        "            \"backend\":   \t\t    \"imagenet\"\n",
        "        },\n",
        "        \"train\" : {\n",
        "            \"actual_epoch\":         1,\n",
        "            \"train_image_folder\":   \"pascal_20_detection/imgs\",\n",
        "            \"train_annot_folder\":   \"pascal_20_detection/anns\",\n",
        "            \"train_times\":          1,\n",
        "            \"valid_image_folder\":   \"pascal_20_detection/imgs_validation\",\n",
        "            \"valid_annot_folder\":   \"pascal_20_detection/anns_validation\",\n",
        "            \"valid_times\":          1,\n",
        "            \"valid_metric\":         \"mAP\",\n",
        "            \"batch_size\":           32,\n",
        "            \"learning_rate\":        1e-4,\n",
        "            \"saved_folder\":   \t\tF\"/content/drive/My Drive/pascal20_detection\",\n",
        "            \"first_trainable_layer\": \"reshape_1\",\n",
        "            \"augumentation\":\t\t\t\tFalse,\n",
        "            \"is_only_detect\" : \t\tFalse\n",
        "        },\n",
        "        \"converter\" : {\n",
        "            \"type\":   \t\t\t\t[\"k210\"]\n",
        "        }\n",
        "    }"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kobC_7gd5mEu",
        "colab_type": "text"
      },
      "source": [
        "Let's check what GPU we have been assigned in this Colab session, if any."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "rESho_T70BWq",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from tensorflow.python.client import device_lib\n",
        "device_lib.list_local_devices()"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "cWyKjw-b5_yp",
        "colab_type": "text"
      },
      "source": [
        "Finally we start the training by passing config dictionary we have defined earlier to setup_training function. The function will start the training with  Reduce Learning Rate on Plateau and save on best mAP callbacks. Every epoch mAP of the model predictions is measured on the validation dataset. If you have specified the converter type in the config, after the training has stopped the script will convert the best model into the format you have specified in config and save it to the project folder.\n",
        "\n",
        "Let's train for one epoch to see how the whole pipeline works."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "deYD3cwukHsj",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from keras import backend as K \n",
        "K.clear_session()\n",
        "model_path = setup_training(config_dict=config)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ypTe3GZI619O",
        "colab_type": "text"
      },
      "source": [
        "After training it is good to check the actual perfomance of your model by doing inference on your validation dataset and visualizing results. This is exactly what next block does. Our model used pre-trained weights and since all the layers were set as non-trainable, we are just observing the perfomance of the model that was trained before."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "jE7pTYmZN7Pi",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "%matplotlib inline\n",
        "from keras import backend as K \n",
        "K.clear_session()\n",
        "setup_inference(config, model_path)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "nKsxhdPvzrD8",
        "colab_type": "text"
      },
      "source": [
        "If you need to convert trained model to other formats, for example for inference with Edge TPU or OpenCV AI Kit, you can do it with following commands. Specify the converter type, backend and folder with calbiration images(normally your validation image folder)."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "awR7r4ILzrmb",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from axelerate.networks.common_utils.convert import Converter\n",
        "converter = Converter('openvino', 'MobileNet1_0', 'pascal_20_detection/imgs_validation')\n",
        "converter.convert_model(model_path)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "JPvYzcRhfs2u",
        "colab_type": "text"
      },
      "source": [
        "To train the model from scratch use the following config and then run the cells with training and (optinally) inference functions again."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "uruWpeGRf6Qi",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "config = {\n",
        "        \"model\":{\n",
        "            \"type\":                 \"Detector\",\n",
        "            \"architecture\":         \"MobileNet1_0\",\n",
        "            \"input_size\":           224,\n",
        "            \"anchors\":              [0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828],\n",
        "            \"labels\":               [\"person\", \"bird\", \"cat\", \"cow\", \"dog\", \"horse\", \"sheep\", \"aeroplane\", \"bicycle\", \"boat\", \"bus\", \"car\", \"motorbike\", \"train\",\"bottle\", \"chair\", \"diningtable\", \"pottedplant\", \"sofa\", \"tvmonitor\"],\n",
        "            \"coord_scale\" : \t\t1.0,\n",
        "            \"class_scale\" : \t\t1.0,\n",
        "            \"object_scale\" : \t\t5.0,\n",
        "            \"no_object_scale\" : \t1.0\n",
        "        },\n",
        "        \"weights\" : {\n",
        "            \"full\":   \t\t\t\t\"\",\n",
        "            \"backend\":   \t\t    \"imagenet\"\n",
        "        },\n",
        "        \"train\" : {\n",
        "            \"actual_epoch\":         100,\n",
        "            \"train_image_folder\":   \"pascal_20_detection/imgs\",\n",
        "            \"train_annot_folder\":   \"pascal_20_detection/anns\",\n",
        "            \"train_times\":          1,\n",
        "            \"valid_image_folder\":   \"pascal_20_detection/imgs_validation\",\n",
        "            \"valid_annot_folder\":   \"pascal_20_detection/anns_validation\",\n",
        "            \"valid_times\":          1,\n",
        "            \"valid_metric\":         \"mAP\",\n",
        "            \"batch_size\":           32,\n",
        "            \"learning_rate\":        1e-4,\n",
        "            \"saved_folder\":   \t\tF\"/content/drive/My Drive/pascal20_detection\",\n",
        "            \"first_trainable_layer\": \"\",\n",
        "            \"augumentation\":\t\t\t\tFalse,\n",
        "            \"is_only_detect\" : \t\tFalse\n",
        "        },\n",
        "        \"converter\" : {\n",
        "            \"type\":   \t\t\t\t[\"k210\",\"tflite\"]\n",
        "        }\n",
        "    }"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "1frVrWMcf-k7",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from keras import backend as K \n",
        "K.clear_session()\n",
        "model_path = setup_training(config_dict=config)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Ipv1AGzRgAMA",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "%matplotlib inline\n",
        "from keras import backend as K \n",
        "K.clear_session()\n",
        "setup_inference(config, model_path)"
      ],
      "execution_count": null,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5YuVe2VD11cd",
        "colab_type": "text"
      },
      "source": [
        "Good luck and happy training! Have a look at these articles, that would allow you to get the most of Google Colab or connect to local runtime if there are no GPUs available;\n",
        "\n",
        "https://medium.com/@oribarel/getting-the-most-out-of-your-google-colab-2b0585f82403\n",
        "\n",
        "https://research.google.com/colaboratory/local-runtimes.html"
      ]
    }
  ]
}